Problems of modelling complex objects can be solved either by deductive logical-mathematical methods or by inductive sorting-out methods. Deductive methods have advantages for rather simple modelling problems where the theory of the object being modelled is known, so that a model can be developed from physically based principles using the user's knowledge of the process. Besides this informational aspect, the practical applicability of modelling techniques and tools plays a significant part in their broad and varied adoption by users. The user is normally interested in the solution of the initial problem and rarely has expert knowledge of deductive mathematical modelling. Efforts to use the known tools of artificial intelligence have failed in many cases in the past. This is because methods of artificial intelligence are based on extracting human skills in a subjective and creative domain: model building. Moreover, this approach cannot solve the significant problems of modelling complex systems, such as inadequate a priori information, a great number of unmeasurable variables, noisy and extremely short data samples, and ill-defined objects with fuzzy characteristics. In such cases, knowledge extraction from data, i.e. deriving a model from experimental measurements using inductive methods, has advantages for rather complex objects about which only little a priori knowledge exists.
One development direction that takes up these practical demands is the self-organization of mathematical models, which can be realized by means of statistical learning networks such as GMDH algorithms.
In classical GMDH algorithms the partial models have to be chosen, whether linear or nonlinear functions, in each generated layer. Lemke [1] has developed an algorithm for the generation of optimum partial models. A complete polynomial of second degree,

f(x_i, x_j) = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2,

is optimized using various selection criteria such as the PESS criterion. In distinction to classical algorithms, this one has the ability to synthesize linear or nonlinear models of optimal complexity depending on the object structure, with a meaningful reduction of model complexity related to the existing noise level of the data. This results in more flexible modelling in each layer, because a partial model may contain none (y = a_0), one, or both input variables of every possible combination, depending on their actual contribution. The aim is to avoid, for short and very noisy data samples, the inclusion of redundant variables which, once part of the model, cannot be excluded afterwards. So, in the end, simpler models can be expected.
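To illustrate, the following is a minimal sketch of such a layer-wise partial model optimization, assuming ordinary least squares for the coefficients and a simple holdout error in place of the PESS criterion (whose exact form is not reproduced here); all function and variable names are illustrative.

```python
import itertools
import numpy as np

def best_partial_model(X_train, y_train, X_val, y_val, i, j):
    """Search all term subsets of the complete second-degree polynomial in
    (x_i, x_j), fit each by least squares on the training part, and keep
    the subset with the smallest validation error.  A plain holdout error
    stands in here for the PESS criterion."""
    def terms(X):
        xi, xj = X[:, i], X[:, j]
        # columns: 1, x_i, x_j, x_i*x_j, x_i^2, x_j^2
        return np.column_stack([np.ones(len(X)), xi, xj, xi * xj, xi**2, xj**2])

    T_tr, T_va = terms(X_train), terms(X_val)
    best_err, best_cols, best_coef = np.inf, None, None
    for r in range(1, 7):
        for cols in itertools.combinations(range(6), r):
            if 0 not in cols:   # keep the constant a0 in every candidate
                continue
            cols = list(cols)
            coef, *_ = np.linalg.lstsq(T_tr[:, cols], y_train, rcond=None)
            err = np.mean((T_va[:, cols] @ coef - y_val) ** 2)
            if err < best_err:
                best_err, best_cols, best_coef = err, cols, coef
    return best_err, best_cols, best_coef
```

Because every candidate subset contains the constant term, the pure-constant model y = a_0 is included in the search, so a partial model with no, one, or both inputs can win, as described above.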
Successful applications of GMDH algorithms are known especially in those areas where theoretical systems analysis is not applicable because of the complexity of the object being examined, the status of knowledge in the related scientific theory, and the required time. An important area, especially for decision support systems, is the analysis and prediction of systems of characteristics. In the following we present a recent example in which the SelfOrganize! tool was used.
3.1. Solvency Checking
The basis for the examination and automatic model synthesis were sets of 19 anonymized characteristics of 81 companies, which had served a banking establishment to decide on a company's solvency. Ten decisions were chosen by the bank for checking the results, and the other 71 decisions were used as the learning set for modelling. There are several methodologies for obtaining the required models using GMDH, but in distinction to neural networks, each of them delivers assertions about the influence of the individual characteristics on the decision.
A. Model of the dependence of the decision on the variables
Linear models y_M = ∑_i a_i x_i and nonlinear static models of the decision variable were generated from the 19 characteristics x_i. The decision variable was set to +1 ("positive") or -1 ("negative") according to the decision. All obtained models extracted the variables x_5, x_8, x_10 and x_15 as significant, e.g.

y_M = -3.4528 + 0.1174 x_5 + 0.1701 x_15 - 0.551 x_8 + 1.311 x_10.

These four variables can be interpreted as the main decision variables.
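For example, the reported linear model can be evaluated directly, with the sign of y_M giving the class. This is a sketch: the decision threshold at y_M = 0 is an assumption following from the ±1 coding, and the 1-based indexing is chosen to match the paper's variable numbers.

```python
def decide(x):
    """Evaluate the reported linear decision model for one company.
    x is the vector of the 19 characteristics; x[0] is unused so that
    x[5], x[8], x[10], x[15] match the paper's 1-based indices."""
    y_m = -3.4528 + 0.1174 * x[5] + 0.1701 * x[15] - 0.551 * x[8] + 1.311 * x[10]
    return "+" if y_m >= 0 else "-"
```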
B. Modelling of independent systems of
equations
Another and more elaborate way is the generation of linear or nonlinear systems of equations separately for all positive and all negative decisions. In the case of linear models these are

x^+ = A^+ x^+ ;   x^- = A^- x^- ;   A = {a_ij}, with a_ii = 0.

Such systems grasp the spectrum of decisions better because they have a greater breadth of variation, and they can be interpreted as well. The corresponding model values x_i^+ and x_i^- are then calculated for the checking-set variables x_i^c. Membership in class + or - was decided on the basis of the deviations

Δ_i^+ = x_i^c - x_i^+   and   Δ_i^- = x_i^c - x_i^- .
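A minimal sketch of this step, assuming A^+ and A^- are the estimated coefficient matrices with zero diagonals, so that each variable is reconstructed from the remaining ones of the checking-set vector:

```python
import numpy as np

def deviations(A_plus, A_minus, x_c):
    """Model values x_i^+ = (A^+ x^c)_i and x_i^- = (A^- x^c)_i for one
    checking-set vector x^c, and the resulting deviation vectors.
    The zero diagonals ensure x_i is predicted only from the other variables."""
    x_plus = A_plus @ x_c      # predictions under the "positive" system
    x_minus = A_minus @ x_c    # predictions under the "negative" system
    return x_c - x_plus, x_c - x_minus   # Delta^+, Delta^-
```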
The results in table I have been obtained in the
following cases:
TABLE I. Classifications obtained from systems of equations

|     | c1 | c2  | c3 | c4 | c5 | c6 | c7 | c8 | c9 | c10 |
| y   | -  | +/- | +  | -  | +  | -  | -  | +  | +  | -   |
| y_M | +  | +   | +  | -  | +  | -  | -  | +  | +  | -   |
a. s^+ = ∑_i |Δ_i^+| ;  s^- = ∑_i |Δ_i^-| .
b. s^+ = ∑_{i∈N} |Δ_i^+| ;  s^- = ∑_{i∈N} |Δ_i^-| , in which N is the set of indices of those variables having influence in model A.
c. s^+ = ∑_{i∈M^+} Δ_i^+ x_i^c ;  s^- = ∑_{i∈M^-} Δ_i^- x_i^c , in which M^+ and M^- are the sets of indices of those input variables for which the best-fitting models were obtained (for positive and negative decisions, respectively).
d. A further way of decision making is to calculate for the variables x_i^c their deviations Δ_i^+ and Δ_i^- and to classify each variable on the basis of the minimum deviation. The final decision is then made as the sum of all individual classifications; a code sketch of all four rules is given below.
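A hedged sketch of the four rules a-d, building on the deviations function above. The index sets N, M^+, M^- and the comparison s^+ < s^- for rule c follow the text's pattern, but their exact handling is an assumption.

```python
import numpy as np

def classify(delta_plus, delta_minus, rule,
             x_c=None, N=None, M_plus=None, M_minus=None):
    """Decide class "+" or "-" from the deviation vectors by rule a, b, c or d.
    N, M_plus, M_minus are arrays of variable indices as defined in the text."""
    if rule == "a":      # sums of absolute deviations over all variables
        s_plus, s_minus = np.abs(delta_plus).sum(), np.abs(delta_minus).sum()
    elif rule == "b":    # restricted to the influential variables N of model A
        s_plus, s_minus = np.abs(delta_plus[N]).sum(), np.abs(delta_minus[N]).sum()
    elif rule == "c":    # deviations weighted by the checking values
        s_plus = (delta_plus[M_plus] * x_c[M_plus]).sum()
        s_minus = (delta_minus[M_minus] * x_c[M_minus]).sum()
    else:                # rule d: one vote per variable, minimum deviation wins
        votes = sum(1 if dp < dm else -1
                    for dp, dm in zip(np.abs(delta_plus), np.abs(delta_minus)))
        return "+" if votes > 0 else "-"
    return "+" if s_plus < s_minus else "-"
```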
C. Synthesis
A synthesis of different classifications makes it possible to describe the wide spectrum of possible decisions better without losing the explanation component. Table II shows a synthesis on the basis of majority decisions.
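The majority synthesis itself is straightforward; a minimal sketch:

```python
from collections import Counter

def synthesize(classifications):
    """Majority decision over the individual classifications,
    e.g. synthesize(["+", "-", "+"]) returns "+"."""
    return Counter(classifications).most_common(1)[0][0]
```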
TABLE II. Synthesis of different classifications

| checking set | target | model A | model B.b | model B.d | synthesis | value |
| c1           | -      | +       | -         | -         | -         | true  |
| c2           | +/-    | +       | +         | +         | +         | true  |
| c3           | +      | +       | +         | +         | +         | true  |
| c4           | -      | -       | -         | -         | -         | true  |
| c5           | +      | +       | +         | +         | +         | true  |
| c6           | -      | -       | -         | -         | -         | true  |
| c7           | -      | -       | -         | -         | -         | true  |
| c8           | +      | +       | +         | +         | +         | true  |
| c9           | +      | +       | +         | -         | +         | true  |
| c10          | -      | -       | -         | -         | -         | true  |